Add an example using Optuna and Transformers #304
Thanks for your work! However, I don't think it's all that different from the current hyperparameter search docs in Transformers, except that it's a more complete example. @merveenoyan @sergiopaniego what do you think?
Just for the record, I had actually wanted to add support for the Transformers library to the optuna-integration package. But since the Transformers library already provides backend support, I contributed a starting example to their repo. This PR builds on that example and gives users a more hands-on way to understand how to apply HPO to transformer models 🙂
@ParagEkbote cookbook mostly contains end-to-end applied AI recipes where library integrations shine 💫 rather than minimal examples. It would be great to make this a more applied ML type of recipe.
…sh to hub to make it more applied.
I have now added the following improvements to the recipe to make it more applied:
Could you please review the changes? cc: @stevhliu, @merveenoyan
What does this PR do?
In this end-to-end tutorial, we use the Optuna library to perform hyperparameter optimization (HPO) on a BERT model with the IMDB dataset.
First, we load and preprocess the dataset and define the model we want to tune. Then we define the evaluation metrics and pass them to the Trainer class, together with a search space covering the learning rate, weight decay, and batch size. Finally, we visualize the results.
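For reviewers who want a quick picture of the search step, here is a minimal sketch of how the search space and the Optuna backend could be wired together. The value ranges, the batch size choices, the default trial count, and the helper names `optuna_hp_space` and `run_search` are illustrative assumptions, not necessarily what the notebook uses. It also assumes `trainer` is a `transformers.Trainer` that was built with `model_init=...` and a `compute_metrics` function.

```python
def optuna_hp_space(trial):
    """Search space over the three hyperparameters named above.

    `trial` is an optuna.trial.Trial; the ranges below are assumed
    placeholders for illustration only.
    """
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.3),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [8, 16, 32]
        ),
    }


def run_search(trainer, n_trials=10):
    """Run the search on an already-configured Trainer (hypothetical helper).

    With no `compute_objective` given, the Trainer derives the objective
    from the evaluation metrics, so "maximize" assumes higher is better.
    """
    return trainer.hyperparameter_search(
        direction="maximize",
        backend="optuna",
        hp_space=optuna_hp_space,
        n_trials=n_trials,
    )
```

Calling `best_run = run_search(trainer)` would then return the best run, whose `hyperparameters` attribute holds the selected values.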
Please let me know if any modifications are required and I will make the necessary changes.
Who can review?
@stevhliu.